Bruce Schatz

Authors

  • Bruce Schatz
  • William H. Mischo
  • Timothy W. Cole
  • Joseph B. Hardin
  • Hsinchun Chen
Abstract

The most important recorded information medium on the Internet, and in the world at large, is the document. Although text might seem prosaic in contrast to multimedia objects, it is still the major medium for communicating information. Internet document retrieval can draw upon years of research results and practical experience in on-line information access as well as from traditional physical libraries. The technology for text information retrieval is far more mature than that for other media. Therefore, documents are also the best vehicle for investigating problems specific to digital libraries, such as the federation problem of making distributed collections of heterogeneous materials appear to be a single integrated collection. The Digital Library Initiative (DLI) project at the University of Illinois at Urbana-Champaign is developing the information infrastructure to effectively search technical documents on the Internet. We are constructing a large testbed of scientific literature, evaluating its effectiveness under significant use, and researching enhanced search technology. We are building repositories (organized collections) of indexed multiple-source collections and federating (merging and mapping) them by searching the material via multiple views of a single virtual collection. Developing widely usable Web technology is also a key goal. Improving Web search beyond full-text retrieval will require using document structure in the short term and document semantics in the long term. Our testbed efforts concentrate on journal articles from the scientific literature, with structure specified by the Standard Generalized Markup Language (SGML). Our research efforts extract semantics from documents using the scalable technology of concept spaces based on context frequency. We then merge these efforts with traditional library indexing to provide a single Internet interface to indexes of multiple repositories.
Our project focuses on developing a large-scale infrastructure adequate for solving real-world problems. The Testbed part of the project is based in the University Library in a new facility that showcases engineering and science information and literature. We are placing article files into the digital library on a production basis in SGML directly from major engineering and science publishers. The National Center for Supercomputing Applications (NCSA) is developing software for the Internet version in an attempt to make server-side repository search as widely available as its Mosaic software made client-side document browsing [1]. The Research section of the project is using NCSA supercomputers to compute indexes for new search techniques on large collections, to simulate the future world, and to provide new technology for the Testbed section.
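
As a rough illustration of the "concept spaces based on context frequency" idea mentioned in the abstract, the following minimal sketch counts how often terms co-occur in the same document and suggests related terms from those counts. It is not the project's algorithm: the plain symmetric co-occurrence count, the function names `build_concept_space` and `related_terms`, and the sample term lists are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(documents):
    """Map each term to a Counter of the terms it co-occurs with.

    `documents` is an iterable of term lists (hypothetical input format).
    Co-occurrence is counted once per document, symmetrically.
    """
    space = defaultdict(Counter)
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def related_terms(space, term, top_n=3):
    """Return up to top_n co-occurring terms, most frequent first."""
    neighbours = space.get(term, Counter())
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_n]]


# Illustrative data only; these term lists are invented for the sketch.
docs = [
    ["digital library", "information retrieval", "sgml"],
    ["digital library", "information retrieval", "federation"],
    ["information retrieval", "concept space"],
]
space = build_concept_space(docs)
print(related_terms(space, "digital library"))
```

A search interface could use such suggestions to offer alternative query terms; a production system would likely weight the counts rather than use them raw.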

Similar resources

of Scientific literature

Bruce Schatz, William Mischo, Timothy Cole, Ann ... The Digital Libraries Initiative (DLI) project at the University of Illinois at Urbana-Champaign (UIUC) was one of six sponsored by the NSF, DARPA, and NASA from 1994 through 1998. Our goal was to develop widely usable Web technology to effectively search technical documents on the Internet. We concentrated ... Susan Harum, Eric Johnson, Laura Neumann, Univer...


isotope effects for a factor of 36.1 in isotopic mass

Kinetics of the reaction of the heaviest hydrogen atom with H2, the 4Heμ + H2 → 4HeμH + H reaction: Experiments, accurate quantal calculations, and variational transition state theory, including kinetic isotope effects for a factor of 36.1 in isotopic mass Donald G. Fleming,1,a) Donald J. Arseneau,1 Oleksandr Sukhorukov,1,b) Jess H. Brewer,2 Steven L. Mielke,3 Donald G. Truhlar,3,a) George C. S...


Concept Extraction in the Interspace Prototype

A comparison of four parsers was undertaken for noun phrase extraction: FastNPE, NPtool, Chopper, and AZ Phraser. FastNPE was found to be the fastest of the parsers, and NPtool the most accurate in extracting noun phrases. Both were subsequently incorporated into the Concept Extractor module of the Interspace Prototype, which is described in detail.


Publication date: 2004